Scientific Data Mining in Astronomy

Author

  • Kirk D. Borne
Abstract

We describe the application of data mining algorithms to research problems in astronomy. We posit that data mining has always been fundamental to astronomical research, since data mining is the basis of evidence-based discovery, including classification, clustering, and novelty discovery. These algorithms represent a major set of computational tools for discovery in large databases, which will be increasingly essential in the era of data-intensive astronomy. Historical examples of data mining in astronomy are reviewed, followed by a discussion of one of the largest data-producing projects anticipated for the coming decade: the Large Synoptic Survey Telescope (LSST). To facilitate data-driven discoveries in astronomy, we envision a new data-oriented research paradigm for astronomy and astrophysics – astroinformatics. Astroinformatics is described as both a research approach and an educational imperative for modern data-intensive astronomy. An important application area for large time-domain sky surveys (such as LSST) is the rapid identification, characterization, and classification of real-time sky events (including moving objects, photometrically variable objects, and the appearance of transients). We describe one possible implementation of a classification broker for such events, which incorporates several astroinformatics techniques: user annotation, semantic tagging, metadata markup, heterogeneous data integration, and distributed data mining. Examples of these types of collaborative classification and discovery approaches within other science disciplines are presented.


Similar resources

A Vision for PetaByte Data Management and Analysis Services for the Arecibo Telescope

We survey the initial steps of a project to build a data management and data mining system for astronomy data generated by the Arecibo Telescope. The total amount of data that our project will have to manage will approach one Petabyte over five years. We describe some of the scientific challenges from the astronomy side, and we discuss initial thoughts on how to address these challenges through...


Data Mining and Machine Learning in Astronomy

We review the current state of data mining and machine learning in astronomy. Data Mining can have a somewhat mixed connotation from the point of view of a researcher in this field. If used correctly, it can be a powerful approach, holding the potential to fully exploit the exponentially increasing amount of available data, promising great scientific advance. However, if misused, it can be litt...


Astroinformatics, data mining and the future of astronomical research (arXiv:1201.1867v1 [astro-ph.IM], 9 Jan 2012)

Astronomy, like many other scientific disciplines, is facing a true data deluge which is bound to change both the praxis and the methodology of everyday research work. The emerging field of astroinformatics, while on the one hand appearing crucial to face the technological challenges, on the other is opening new exciting perspectives for new astronomical discoveries through the implementation of ad...


Massive Datasets in Astronomy

Astronomy has a long history of acquiring, systematizing, and interpreting large quantities of data. Starting from the earliest sky atlases through the first major photographic sky surveys of the 20th century, this tradition is continuing today, and at an ever increasing rate. Like many other fields, astronomy has become a very data-rich science, driven by the advances in telescope, detector, a...


A Brief Survey On Data mining For Biological and Environmental Problems

In the past, researchers have applied data mining techniques in many areas. Large amounts of data have been collected from scientific domains such as geosciences, astronomy, meteorology, geology and the biological sciences. Researchers also use data mining techniques and tools for biological and environmental problems. In biological science, data mining used in sequence alignment is based on the ...



Journal:
  • CoRR

Volume abs/0911.0505

Publication date: 2009